Sample-Efficient Model-Free Reinforcement Learning with Off-Policy Critics
Value-based reinforcement-learning algorithms provide state-of-the-art
results in model-free discrete-action settings, and tend to outperform
actor-critic algorithms. We argue that actor-critic algorithms are limited by
their need for an on-policy critic. We propose Bootstrapped Dual Policy
Iteration (BDPI), a novel model-free reinforcement-learning algorithm for
continuous states and discrete actions, with an actor and several off-policy
critics. Off-policy critics are compatible with experience replay, ensuring
high sample-efficiency, without the need for off-policy corrections. The actor,
by slowly imitating the average greedy policy of the critics, leads to
high-quality and state-specific exploration, which we compare to Thompson
sampling. Because the actor and critics are fully decoupled, BDPI is remarkably
stable, and unusually robust to its hyper-parameters. BDPI is significantly
more sample-efficient than Bootstrapped DQN, PPO, and ACKTR, on discrete,
continuous and pixel-based tasks. Source code:
https://github.com/vub-ai-lab/bdpi.
Comment: Accepted at the European Conference on Machine Learning 2019 (ECML
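The actor update described in the abstract above ("slowly imitating the average greedy policy of the critics") can be illustrated with a minimal tabular sketch. This is not the authors' implementation: the tabular Q-arrays, the function names `greedy_policy` and `actor_step`, and the step size `rate` are assumptions made purely for illustration.

```python
import numpy as np

def greedy_policy(q_values):
    """Return a one-hot greedy policy per state.

    q_values: array of shape (n_states, n_actions) (a hypothetical tabular critic).
    """
    policy = np.zeros(q_values.shape, dtype=float)
    policy[np.arange(q_values.shape[0]), q_values.argmax(axis=1)] = 1.0
    return policy

def actor_step(actor, critics, rate=0.05):
    """Move the actor a small step towards the average greedy policy of the critics.

    actor:   array (n_states, n_actions), each row a probability distribution.
    critics: list of Q-value arrays with the same shape as the actor.
    rate:    assumed imitation step size in (0, 1]; small values mean slow imitation.
    """
    avg_greedy = np.mean([greedy_policy(q) for q in critics], axis=0)
    return (1.0 - rate) * actor + rate * avg_greedy

# Two states, three actions, two critics that disagree in the second state.
actor = np.full((2, 3), 1.0 / 3.0)
critics = [
    np.array([[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]]),
    np.array([[1.0, 0.0, 0.0], [0.0, 0.0, 3.0]]),
]
new_actor = actor_step(actor, critics)
```

Because the update is a convex combination of two probability distributions, each row of the actor remains a valid distribution. Where the critics disagree, the actor spreads probability over their greedy actions, which is one way to read the state-specific exploration mentioned in the abstract.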
Preferred Basis in a Measurement Process
The effect of decoherence is analysed for a free particle, interacting with
an environment via a dissipative coupling. The interaction between the particle
and the environment occurs by a coupling of the position operator of the
particle with the environmental degrees of freedom. By examining the exact
solution of the density matrix equation one finds that the density matrix
becomes completely diagonal in momentum with time while the position space
density matrix remains nonlocal. This establishes the momentum basis as the
emergent 'preferred basis' selected by the environment which is contrary to the
general expectation that position should emerge as the preferred basis since
the coupling with the environment is via the position coordinate.
Comment: Standard REVTeX format, 10 pages of output. Accepted for publication
in Phys. Rev
Bandit Models of Human Behavior: Reward Processing in Mental Disorders
Drawing inspiration from behavioral studies of human decision making, we
propose here a general parametric framework for the multi-armed bandit problem,
which extends the standard Thompson Sampling approach to incorporate reward
processing biases associated with several neurological and psychiatric
conditions, including Parkinson's and Alzheimer's diseases,
attention-deficit/hyperactivity disorder (ADHD), addiction, and chronic pain.
We demonstrate empirically that the proposed parametric approach can often
outperform the baseline Thompson Sampling on a variety of datasets. Moreover,
from the behavioral modeling perspective, our parametric framework can be
viewed as a first step towards a unifying computational model capturing reward
processing abnormalities across multiple mental conditions.
Comment: Conference on Artificial General Intelligence, AGI-1
First-Stage Development of the Pitjantjatjara Translation of the World Health Organization’s Alcohol, Smoking and Substance Involvement Screening Test (ASSIST)
Substance use is a leading contributor to global disease, illness and death. Compared with non-Indigenous Australians, Aboriginal and Torres Strait Islander Australians are at an increased risk of substance-related harms due to the experience of additional social, cultural, and economic factors. While preventive approaches, including screening and early interventions, are promising, there are currently limited options available to healthcare workers that are culturally appropriate for use in Aboriginal and Torres Strait Islander populations. Therefore, the aim of this research was to translate and culturally adapt the World Health Organization-endorsed Alcohol, Smoking and Substance Involvement Screening Test (ASSIST) into Pitjantjatjara. This paper first describes the process of translation and adaptation of the instrument (Phase 1). The process of focus-group testing the translated instrument for accuracy and cultural appropriateness is also discussed (Phase 2). Key findings from both phases are presented in the context of how the research team worked with key stakeholders in the community to identify facilitators and work through barriers to implementation. The findings from this paper will be used to inform the development of a digital, app-based version of the instrument for the purposes of pilot-testing and validation.
A Multi-Armed Bandit to Smartly Select a Training Set from Big Medical Data
With the availability of big medical image data, the selection of an adequate
training set is becoming more important to address the heterogeneity of
different datasets. Simply including all the data not only incurs high
processing costs but can even harm the prediction. We formulate the smart and
efficient selection of a training dataset from big medical image data as a
multi-armed bandit problem, solved by Thompson sampling. Our method assumes
that image features are not available at the time of the selection of the
samples, and therefore relies only on meta information associated with the
images. Our strategy simultaneously exploits data sources with high chances of
yielding useful samples and explores new data regions. For our evaluation, we
focus on the application of estimating the age from a brain MRI. Our results on
7,250 subjects from 10 datasets show that our approach leads to higher accuracy
while only requiring a fraction of the training data.
Comment: MICCAI 2017 Proceeding
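The idea of treating data sources as bandit arms and choosing among them with Thompson sampling can be sketched as follows. This is a generic Beta-Bernoulli Thompson sampler, not the paper's method: the binary "useful sample" reward, the simulated `true_usefulness` rates, and the function names are all assumptions for illustration.

```python
import numpy as np

def thompson_select(successes, failures, rng):
    """Sample a usefulness estimate per source from its Beta posterior, pick the best."""
    samples = rng.beta(successes + 1.0, failures + 1.0)  # Beta(1, 1) prior
    return int(np.argmax(samples))

def run_selection(true_usefulness, n_rounds, seed=0):
    """Simulate drawing training samples from several data sources.

    true_usefulness: hypothetical probability that a sample drawn from each
                     source turns out to be useful (unknown to the sampler).
    Returns how many samples were drawn from each source.
    """
    rng = np.random.default_rng(seed)
    n_sources = len(true_usefulness)
    successes = np.zeros(n_sources)
    failures = np.zeros(n_sources)
    picks = np.zeros(n_sources, dtype=int)
    for _ in range(n_rounds):
        arm = thompson_select(successes, failures, rng)
        # Assumed binary feedback: was the drawn sample useful?
        if rng.random() < true_usefulness[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
        picks[arm] += 1
    return picks

picks = run_selection(true_usefulness=[0.1, 0.9], n_rounds=500)
```

Sampling from the posterior rather than taking its mean is what balances the two behaviours named in the abstract: sources with good track records are drawn from often, while rarely tried sources still have wide posteriors and are occasionally explored.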
Production of Androgens by Microbial Transformation of Progesterone in Vitro: A Model for Androgen Production in Rivers Receiving Paper Mill Effluent
We have previously documented the presence of progesterone and androstenedione in the water column and bottom sediments of the Fenholloway River, Taylor County, Florida. This river receives paper mill effluent and contains masculinized female mosquitofish. We hypothesized that plant sterols (e.g., β-sitosterol) derived from the pulping of pine trees are transformed by bacteria into progesterone and subsequently into 17α-hydroxyprogesterone, androstenedione, and other androgens. In this study, we demonstrate that these same androgens can be produced in vitro by the bacterium Mycobacterium smegmatis. In a second part to this study, we reextracted and reanalyzed the sediment from the Fenholloway River and verified the presence of androstadienedione, a Δ1 steroid with androgen activity.
Examining Muscle Activity Differences During Single and Dual Vector Elastic Resistance Exercises
# Background
Elastic resistance exercise is a common part of rehabilitation programs. While these exercises are highly prevalent, little information exists on how adding an additional resistance vector with a different direction from the primary vector alters muscle activity of the upper extremity.
# Purpose
The purpose of this study was to examine the effects of dual vector exercises on torso and upper extremity muscle activity in comparison to traditional single vector techniques.
# Study Design
Repeated measures design.
# Methods
Sixteen healthy university-aged males completed four common shoulder exercises against elastic resistance (abduction, flexion, internal rotation, external rotation) while using a single or dual elastic vector at a fixed cadence and standardized elastic elongation. Surface electromyography was collected from 16 muscles of the right upper extremity. Mean, peak and integrated activity were extracted from linear enveloped and normalized data and a 2-way repeated measures ANOVA examined differences between conditions.
# Results
All independent variables differentially influenced activation. Interactions between single/dual vectors and exercise type affected mean activation in 11/16 muscles, while interactions in peak activation existed in 7/16 muscles. Adding a secondary vector increased activation predominantly in flexion or abduction exercises; little change occurred when adding a second vector in internal and external rotation exercises. The dual vector exercise in abduction significantly increased mean activation in lower trapezius by 25.6 ± 8.11 %MVC and peak activation in supraspinatus by 29.4 ± 5.94 %MVC (p<0.01). Interactions between single/dual vectors and exercise type affected integrated electromyography for most muscles; the majority of these muscles had the highest integrated electromyography in the dual vector abduction condition.
# Conclusion
Muscle activity often increased when a second resistance vector was added; however, the magnitude was exercise-dependent. The majority of these changes occurred in the flexion and abduction exercises, with little difference in the internal or external rotation exercises.
# Level of Evidence
3